Abstract- Enterprise information technology is changing rapidly
and has become an integral part of, and a strategic asset to,
the business. New technologies and their adoption help IT
applications run faster, provide more efficient connectivity,
and create new business avenues for the organization. Through
all these changes one thing has remained constant: the
underlying need to build and run a resilient IT environment that
can recover on demand and be continuously available. This paper
discusses some of the major issues and challenges that must be
addressed in business continuity management and IT disaster
recovery management to run a business in today's environment.
I. Introduction



It is widely accepted that money attracts growth. To earn more,
an organization needs a mature information technology
infrastructure that not only delivers the right information
within a very short time but is also backed by a sound disaster
recovery (DR) plan. A predictable and repeatable IT DR plan is
the most effective antidote to application outages. In the first
half of 2011, the aftermath of disasters such as the massive
cyclones that struck the Australian state of Queensland, the
catastrophic tsunamis that struck northern Japan following a
major earthquake, and the hundreds of tornadoes that struck the
central and southeastern US further reinforced to the business
community at large the strong need for disaster recovery and
business continuity readiness. A centralized information system
requires a business continuity management program that is
implemented and managed effectively and that creates a single
place in the organization where mission-critical business and
technology processes are documented and managed. In 1991,
restructuring was undertaken to free the private sector from
government controls and to improve the fiscal system. Experts
feel that the focus was on reforming the agricultural and urban
sectors, human resources development, and the management of
public services. As coordination between the private and public
sectors brings further improvement, information technology plays
a major role and changes the nature of business so that
organizations can compete in the global market. Nowadays,
disaster recovery solutions are an integral part of the business
organization as well [1, 2, 4].

II. Disaster recovery management and its impact in today's business:

As businesses continue to expand, it is clear that best
practices for monitoring and managing IT recovery are needed. A
disaster recovery solution consists of several physical
subsystems with logical relationships between them. The physical
subsystems are servers, applications, data replication,
networks, storage (primary or secondary), the primary data
center, and the disaster recovery data center. The logical
relationships include the order of recovery, the
interdependencies between components, and the actions required
to recover each subsystem. Disaster recovery management combines
all these subsystems and relationships to orchestrate IT system
recovery. There are several common myths about IT recovery
readiness. One of the biggest misconceptions is that if data
replication is in place, the application is recovery ready. Data
replication is only one important part of recovery readiness.
The others are process, people, and integration with the
technology subsystems at the primary and DR sites. All of these
are required for predictable recovery. Whatever business
strategy an organization uses to survive in today's globally
competitive market, it should answer the following questions to
assess its recovery readiness [3, 7, 9].




Figure 1. Disaster Recovery Life Cycle

1) Are recovery point and recovery time objectives defined for
all of the organization's critical applications?

2) When was the last disaster recovery drill performed, and was
it successful?

3) During the disaster recovery drill, was the run book found to
be out of sync with the current configuration?

4) Are disaster recovery drills being delayed because resources
are unavailable or insufficient, or because of concern about the
impact on production?

5) Is it common to perform the DR drill for only one or two
applications at a time? If so, has a drill ever been performed
for several applications together, as would be required in the
event of a larger outage?

6) Is management interested in a weekly or daily report on
application recovery readiness status and a recovery service
level agreement (SLA) report?

7) Are the reports and evidence sufficient to demonstrate the
applications' disaster recovery capability to auditors and
regulators?

8) Will DR recovery work smoothly if the database administrator
leaves the organization?

Answering the above questions makes clear the capabilities of
the organization's disaster recovery and the challenges of
having a disaster recovery solution that will work as and when
required [10, 14, 16].
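
The logical relationships described in this section (order of
recovery and dependencies between components) can be pictured as a
dependency graph from which a recovery order is derived. The
following is a minimal Python sketch, not a definitive
implementation; the subsystem names and the dependencies between
them are illustrative assumptions, not a configuration taken from
any particular DR product.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (an assumption for illustration):
# each subsystem lists the subsystems that must be recovered before it.
DEPENDS_ON = {
    "storage": [],
    "network": [],
    "data_replication": ["storage", "network"],
    "servers": ["storage", "network"],
    "applications": ["servers", "data_replication"],
}


def recovery_order(depends_on):
    """Return the subsystems in an order that respects every dependency.

    graphlib raises CycleError if the dependencies are circular, which
    would mean that no valid recovery order exists.
    """
    return list(TopologicalSorter(depends_on).static_order())


order = recovery_order(DEPENDS_ON)
print(order)
```

Keeping the dependencies as data rather than as prose in a run book
makes it possible to recompute the order whenever a component is
added or a dependency changes.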

III. Major challenges for disaster recovery 

The business suffers, and its standing in the market is
significantly affected, when critical applications are
unavailable and cannot be recovered within the recovery times
set by the business. The major challenges that organizations
face today can be grouped as follows:

A. Production downtime:

Manual drills and unpredictable outcomes can keep a critical
application down for some time, which in turn impacts the
business.

B. Deployment and operational cost:

When every DR deployment becomes a professional services
engagement, cost and project time vary with the requirements of
each engagement.

C. Lack of visibility into SLAs:

When management and IT operations do not know whether the
recovery solution is meeting service levels, the result is a
lack of confidence in the operation and a reduced return on
investment.

D. Manual operations:

Depending on people to execute recovery procedures or steps
during an emergency exposes the business to more risk. People
tend to make more mistakes when working under crisis conditions.

E. Need for DR expertise:

A typical enterprise uses heterogeneous technologies. Instead of
depending on a single person or a dashboard to monitor the
recovery solution and provide the drill steps, the organization
depends on various technology experts being available.

Figure 2. Typical server hierarchy in network

IV. Disaster recovery management - a challenging approach for IT recovery:

IT disaster recovery management (IT DRM) is a challenging and
emerging discipline that enables IT to meet the recovery
objectives set by the business. In today's global and
competitive market, without IT DRM, IT recovery is largely
manual and highly dependent on expertise, with little visibility
into how well recovery service levels are being met. Every DR
solution for an application must be designed to meet certain key
DR metrics and must cover a set of DR processes [13, 15, 17].

A. Key DR Metrics 

a) Recovery point objective (RPO):

It is the amount of application data, measured in time, that an
organization can afford to lose before the loss adversely
impacts the business; the tolerable amount depends on the
organization. For example, a bank cannot afford to lose any data
from its ATM application, so its RPO is zero.

b) Recovery time objective (RTO):

It is the amount of time an application can be down while it is
being recovered. For example, an application with an RTO of two
hours must be recovered within two hours of becoming unavailable
due to an outage.




Figure 3. Backup policies in DR 
c) Data lag:

This specifies the amount of data by which the disaster recovery
site is behind the primary production site. The unit of data lag
depends on the unit used to measure the size of the data. It can
depend on the technology deployed to replicate data and is
usually measured in MB, GB, or number of files.
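
The three metrics above can be checked mechanically once the
relevant timestamps and byte counts are available. The sketch
below is one possible way to express them in Python, under stated
assumptions: the class name, function names, and the idea of
deriving lag from byte counters are hypothetical and do not come
from any specific replication technology.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # maximum tolerable downtime during recovery


def meets_rpo(objectives, last_replicated_at, now):
    """True if the data not yet replicated spans no more time than the RPO."""
    return (now - last_replicated_at) <= objectives.rpo


def meets_rto(objectives, outage_started_at, service_restored_at):
    """True if the application was restored within the RTO."""
    return (service_restored_at - outage_started_at) <= objectives.rto


def replication_lag_mb(primary_bytes_written, dr_bytes_applied):
    """Amount of data (in MB) by which the DR site trails the primary site."""
    return max(primary_bytes_written - dr_bytes_applied, 0) / (1024 * 1024)


# Illustrative values only.
objectives = RecoveryObjectives(rpo=timedelta(minutes=15), rto=timedelta(hours=2))
noon = datetime(2024, 1, 1, 12, 0)

rpo_ok = meets_rpo(objectives, noon - timedelta(minutes=10), noon)  # 10 min behind
rto_ok = meets_rto(objectives, noon, noon + timedelta(hours=2, minutes=30))
lag_mb = replication_lag_mb(5 * 1024 * 1024, 3 * 1024 * 1024)
```

With these sample values the RPO check passes (10 minutes is within
15), the RTO check fails (2.5 hours exceeds 2), and the DR site is
2 MB behind. An RPO of zero, as in the ATM example, would require
the last replicated timestamp to equal the current time.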




Figure 4. Typical disaster recovery cycle



B. DR Process: 

Several processes form part of the disaster recovery solution
and must be thought through and designed in order to achieve a
better recovery plan. There must be a run book containing a
series of predefined steps to be consulted for each of the DR
processes.

• Provision:

Deploy the DR solution for the application using the best
available DR infrastructure and the best practices and
procedures recommended by the application vendors.

• Monitoring:

Perform real-time monitoring of DR metrics and their parameters
to ensure that the objectives of the DR system are met and that
it is operationally healthy.

• Validation:

Perform daily or weekly configuration checks to ensure that the
DR systems are up to date with the production systems with
regard to ongoing change management updates.

• Test/DR Drills:

Perform quarterly and half-yearly DR drills, which include
switchover and switchback of the application at the DR site, and
validate DR operational readiness. Switchover means that
production is brought down and services are made available from
the DR site; business users then test the application that has
come up on the DR system. The switchback process moves services
back to production, and the normal copy process resumes.

• Failover Recovery:

Document and automate the application recovery steps, including
failover and fallback procedures for different scenarios.
Recover the applications within the recovery time objectives
when recovery is invoked during a crisis. When an outage occurs
in production, the corresponding failover process is invoked to
recover the desired services at the DR site. After the cause of
the outage has been identified, the fallback process covers the
steps to move the services back to the corresponding production
site.

• Reports:

Provide yearly audit and compliance reports on DR drills and
other DR activities to meet regulatory requirements. Furnish
weekly reports on DR status to management and application owners
to ensure acceptance of the DR systems.
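
The validation process listed above amounts to comparing the
configuration of the production systems with that of the DR
systems and reporting any differences. The following Python sketch
illustrates the idea under simple assumptions: configurations are
represented as flat dictionaries, and the setting names and values
are invented for illustration.

```python
def config_drift(production, dr):
    """Return {setting: (production_value, dr_value)} for every difference.

    A setting that is absent on one side is reported with None for that side.
    """
    missing = object()  # sentinel so that absent keys differ from any value
    drift = {}
    for key in sorted(set(production) | set(dr)):
        prod_value = production.get(key, missing)
        dr_value = dr.get(key, missing)
        if prod_value != dr_value:
            drift[key] = (
                None if prod_value is missing else prod_value,
                None if dr_value is missing else dr_value,
            )
    return drift


# Hypothetical configuration snapshots.
production = {"os_patch": "5.4.2", "db_version": "12.1", "app_build": "2.7"}
dr_site = {"os_patch": "5.4.2", "db_version": "11.9"}

drift = config_drift(production, dr_site)
```

Here the result flags `db_version` (12.1 in production, 11.9 at the
DR site) and `app_build` (missing at the DR site). An empty result
would indicate that the DR systems are in step with production for
the settings being tracked.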

V. Capabilities of the IT DRM software: 

IT DRM software must be able to monitor and report on DR metrics
and to automate all of the DR processes. The solution must offer
the following capabilities:

A. Monitoring and validation of recovery service levels: 

Recovery point and recovery time are the metrics that need to be
monitored for a DR solution. Real-time monitoring of the
recovery point objective (RPO) and recovery time objective (RTO)
should ensure that the applications are meeting their recovery
objectives.

B. Automation of failover and DR Drill processes: 

Over the life cycle of a DR solution there are several stages,
each requiring several steps to be performed to secure the
business data. Failover is a series of steps that brings up the
application at the DR site when the primary site has gone down
due to unavoidable circumstances. Switchover is a series of
steps that shuts down the primary and brings up the DR site in a
planned manner. Automating these steps ensures that the DR
process takes place in a predictable and reliable manner [18,
20, 21].
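
To make the idea of automating a series of steps concrete, the
following Python sketch runs an ordered list of run book steps,
stops at the first failure, and records the elapsed time so that it
can be compared with the RTO. It is only an illustration: the
`Step` class, the `run_runbook` function, and the step names are
hypothetical, and the actions are placeholders standing in for real
operations.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # returns True on success, False on failure


def run_runbook(steps, clock=time.monotonic):
    """Execute steps in order and stop at the first failure.

    Returns (succeeded, names_of_completed_steps, elapsed_seconds).
    """
    started = clock()
    completed = []
    for step in steps:
        if not step.action():
            return False, completed, clock() - started
        completed.append(step.name)
    return True, completed, clock() - started


# Placeholder switchover run book; each lambda stands in for a real operation.
switchover = [
    Step("stop application on primary", lambda: True),
    Step("confirm replication has caught up", lambda: True),
    Step("start application on DR site", lambda: True),
    Step("redirect users to DR site", lambda: True),
]

succeeded, completed, elapsed = run_runbook(switchover)
```

Because the same function runs both drills and real failovers, every
drill also exercises the exact sequence that would be used in an
outage, and the recorded elapsed time gives a measured value to set
against the RTO.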

C. Unified management approach that takes an application view of
recovery:

Application recovery requires that the various components,
including operating systems, network, storage, data protection,
and applications, are all recovery ready. A unified approach
ensures that a single interface can present and manage a
complete view that includes event management across the stack.

D. Analytics and reporting for compliance and regulatory
purposes:

Regulatory authorities require evidence of control that
demonstrates that drills have been conducted. RPO and RTO
trending reports help IT managers identify saturation of
resources such as network bandwidth and draw particular
attention to recovery steps that are time consuming and
laborious [4, 5, 10].
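
One simple way to produce a trend figure for such a report is to
fit a straight line to periodic measurements and look at its slope.
The Python sketch below does this with an ordinary least-squares
fit over equally spaced samples; the function name and the sample
values are assumptions made for illustration, and a real report
would likely use richer statistics.

```python
def trend_slope(samples):
    """Least-squares slope of equally spaced samples (change per interval).

    A persistently positive slope in measured replication lag suggests
    that a resource such as network bandwidth is approaching saturation.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed to compute a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical daily replication lag measurements, in minutes.
rising = trend_slope([5, 6, 7, 8])
steady = trend_slope([4, 4, 4])
```

In this example the first series grows by one minute per day and
the second is flat, so only the first would be flagged for
attention.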
VI. DRM and its benefits and impacts in business:

The recovery manager is built on a powerful automation engine
that understands and combines the dependencies required for
successful application recovery. A central web-based console
offers an easy way to coordinate the execution of recovery
actions and track their status as they run [14, 16, 18]. DRM
enables the following business benefits:

A. Reduce business exposure to IT outages: 

Non-availability of IT applications poses a huge business risk.
This risk can be overcome by ensuring IT recovery readiness,
which enables IT to recover applications within the recovery
objectives set by the business and thus minimizes the impact of
IT outages.

1) Achieve higher operational efficiency:

Recover IT applications at lower cost, with faster execution,
and in a scalable manner.

2) Adopt industry best practices:

Successful recovery is achieved with the help of expert IT
professionals, mature processes, and robust technologies. The IT
recovery process integrates easily with best practices for
incident, configuration, change, audit, and service level
management processes.

3) Validate complex recovery for multi-vendor physical and
virtual environments.

4) Gain real-time insight into application data loss and
recovery time.

5) Rapidly identify the causes of recovery test failures and the
appropriate solutions.

6) Design recovery workflows that meet service levels and RPO
and RTO objectives efficiently.

7) Global recovery audit reporting and documentation. 



VII. Cost comparison of DR operations: traditional approach vs. DR management software:

| Traditional Approach | DR Management Software |
|---|---|
| Personnel are required for DR: the DR team works on DR strategy, plan readiness, coordination among various teams, and the DR readiness reporting process. | The DR team does not need to grow as the number of applications under DR increases. A central web-based console for coordination and built-in DR best practices help the team manage more applications. |
| The IT processes for change, policy, and backup management that are implemented on the primary site must also be implemented on the DR site. Event monitoring, exception reporting, and service level agreement compliance and analytics are part of the DR process. | Real-time monitoring of DR health and data replication, validation of primary and DR environment equivalence, and exception reporting when DR SLAs are not being met drive down operational cost heavily while increasing DR readiness. |
| DR automation, which involves automating the steps for DR drills and failover recovery. | With DR automation there is a significant reduction in the people cost of performing DR drills. |



VIII. Types of recovery services and their impacts

A. Data dependency mapping technology

These are software products that determine and report on the
likelihood of achieving specified recovery targets by analyzing
and correlating data from applications, databases, clusters,
operating systems, virtual systems, networking, and storage
replication mechanisms. One technology approach taken by
different vendors is the use of well-defined storage management
problem "signatures" supported by industry-standard storage and
data management software, in combination with passive
monitoring of local and remote storage traffic.

B. Recovery exercising:

This is a set of sequenced testing tasks, typically performed
at a recovery data center facility, focused on ensuring that
access to and usage of a production application can be
restarted within a specified time period with the required
level of data consistency and an acceptable level of data loss.
As the recovery scope of mission-critical business processes,
applications, and data increases, however, sustaining the
quality and consistency of recovery exercises can be a daunting
technical and logistical challenge, especially as the frequency
of recovery exercises increases along with the frequency of
change.

C. Cloud based recovery services 

These services are delivered by public cloud providers and are
primarily infrastructure as a service. They include recovery in
the cloud and cloud-based storage services. Recovery in the
cloud supports a combination of server image and production
data backup to the service provider's data center. When the
customer requires access to the replicated server images and
production data for plan exercising or to support live recovery
operations, the server images are dynamically restored to
available hardware and reactivated.

The recovery-in-the-cloud value proposition is twofold. First,
because server restoration on demand does not require any
pre-allocation of specific computing resources, provider
customers have the opportunity to exercise their recovery plans
more frequently. Second, because server images are restored to
the provider's server hardware when needed, and production data
has already been stored inside the provider cloud, the need for
either shared-subscription or dedicated server and storage
equipment can be significantly reduced, if not totally
eliminated.

D. Virtual Machine Recovery 

It focuses on protecting and recovering data from VMs, as
opposed to the physical servers the VMs run on or
non-virtualized systems. Server virtualization for the x86
platform from vendors such as VMware, Citrix Systems, and
Microsoft is gaining considerable attention, and the deployment
of x86 VMs is growing at roughly 50% every year.

VM recovery solutions help recover from problems including
user, application, or administrator errors, such as the
accidental deletion or overwriting of a file; logical errors,
such as viruses; physical errors, such as disk failures; and
disaster recovery errors, such as site loss. They can also
offer improved granular recovery of data and files in a VM
environment, in addition to recovery of the entire VM.

IX. Conclusion 

From the above discussion, the actual power of disaster
recovery management can be easily understood. It verifies that
business-critical applications are meeting their RPO goals and
provides timely alerts when the RPO deviates, enabling timely
remediation that ensures critical applications are always
recovery ready. The primary reason organizations deploy
disaster recovery solutions is to reduce the financial impact
of IT outages. DR automation drives the return on investment of
DR management software in two key areas. First, DR automation
enables DR drills to scale as more applications need to be
tested at regular intervals. Second, single-button failover
recovery of applications can dramatically increase the
confidence to invoke an application at the DR site, resulting
in reduced downtime and higher utilization of DR assets.